Sample (In Database) (In-Database Processing)

Synopsis

This operator creates a sample of the input. The sample size can be specified either as an absolute number of rows or as a probability.

Description

This operator is similar to the Filter Example Range operator. The number of examples in the sample can be specified on an absolute or a probability basis, depending on the setting of the sample parameter. For an absolute sample, you define the exact number of rows to be returned. For a probability sample, the required parameter is the sample probability, which lies in [0,1] and defines the size of the returned sample as a fraction of all rows. A probability of 0 is equivalent to setting the absolute sample size to 0. A probability of 1 returns the whole input. Some databases allow a seed to be defined for the random sample generation. Where no seed can be set, or none is given, the result may not be deterministic.
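The two sampling modes can be illustrated with a small in-memory sketch in Python. This is not the operator's implementation and not the SQL it generates. The function name sample_rows and the mode strings are hypothetical, and two behaviors are assumptions made for illustration: the probability mode keeps each row independently with the given probability, and the absolute mode returns all rows when fewer rows exist than requested.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def sample_rows(
    rows: Sequence[T],
    mode: str,
    sample_size: int = 0,
    sample_probability: float = 0.0,
    seed: Optional[int] = None,
) -> List[T]:
    """Emulate absolute and probability sampling on a list of rows (hypothetical helper)."""
    # seed=None means the generator is seeded from the system, so runs differ.
    rng = random.Random(seed)
    if mode == "absolute":
        # Assumption: cap the request at the number of available rows.
        k = min(sample_size, len(rows))
        return rng.sample(list(rows), k)
    if mode == "probability":
        if not 0.0 <= sample_probability <= 1.0:
            raise ValueError("sample_probability must be in [0, 1]")
        # Assumption: each row is kept independently, so the result size
        # varies around sample_probability * len(rows).
        return [row for row in rows if rng.random() < sample_probability]
    raise ValueError(f"unknown mode: {mode!r}")


rows = list(range(100))
ten_rows = sample_rows(rows, "absolute", sample_size=10, seed=1)
no_rows = sample_rows(rows, "probability", sample_probability=0.0)
all_rows = sample_rows(rows, "probability", sample_probability=1.0)
```

In this sketch a probability of 0 yields an empty result and a probability of 1 yields the whole input, matching the edge cases described above. Passing the same seed twice reproduces the same sample, while omitting the seed does not.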

Input

  • example set input

Output

  • example set output

Parameters

  • sample This parameter determines whether the sample size is specified as an absolute number of rows or as a probability. Range: selection
  • sample_size The number of examples which should be sampled. Range: long
  • sample_probability This parameter specifies the sample probability. Note that neither the sample nor the sample size is guaranteed to be deterministic without a seed value. Range: real
  • use_local_random_seed Indicates if a local random seed should be used. Range: boolean
  • local_random_seed Specifies the local random seed. Range: integer